It is a given that in any computerized environment,
things will not be perfect. Machines break down. Programs have bugs.
Users make mistakes. Most systems are not optimized from the outset.
Imperfections should be expected as part of automation. It is a good
thing that in this day and age, tools are available to troubleshoot
problems, analyze performance, and implement changes to improve systems.
In particular, SQL Server and the Windows operating system provide the
mechanisms needed to perform standard database management
troubleshooting. Let us begin by looking at the diagnosis and resolution
of some common problems.
Tuning the Operating System and Hardware
Much of the
performance of any database system relies on the application and
database designers. The use of the network, processor, memory, and disk
storage can all be dictated by the type of database design and the use
of the applications operating against the database server. For this
reason, the operating system usually acts as only a starting point in
any performance analysis testing. If the hardware configuration is not
adequate and the operating system is not properly configured, the
database engine will not be able to respond optimally.
The first task that the
hardware and operating system serve in any database system is to provide
a means to store and operate the database software and objects. The
operating system is also responsible for reporting hardware and software
problems, as well as making it possible for you to monitor everything
executing on the machine.
Creating and Maintaining Statistics
The SQL Server Query
Optimizer is the database engine component of SQL Server 2005. As the
database engine, it oversees all data-related interaction. It is
responsible for generating the execution plans for any SQL operation. In
diagnosing a query, the Optimizer must decide on the most efficient
means of executing the query and interacting with the database objects.
SQL Server has a cost-based
optimizer that can be extremely sensitive to the information provided by
statistics. Without accurate and up-to-date statistical information,
SQL Server has a great deal of difficulty determining the best execution
plan for a particular query.
Often, non-updated statistics cause a system to slow and become unresponsive over time. Periodically checking and using UPDATE STATISTICS on most production systems is warranted.
SQL Server goes
through a considerable process when it chooses one execution plan out of
several possible methods of executing a given operation. This
optimization is one of the most important components of a SQL Server
database system. Although the Optimizer’s analysis process incurs some
overhead, that overhead is saved in execution.
The Optimizer uses a
cost-based analysis procedure. Each possible method of execution has an
associated cost, which is determined in terms of the approximate amount
of computing resources used in execution. The Query Optimizer must
analyze the possible plans and choose the one that has the lowest
estimated cost.
It is not uncommon for some complex SELECT
statements to have thousands of possible plans. Of course, in such a
case, the Optimizer does not analyze every possible combination. It uses
a complex series of processes to find a plan that has a cost that is
reasonably close to the minimum—a minimum that is only theoretical and
unlikely to be achieved.
The SQL Server
Optimizer generally makes the best choice for any given execution. It is
possible to override the Optimizer’s choices by using code and forcing
the execution with hints; however, this is not recommended without
significant testing. When a query using hints performs better, you
usually implement additional indexing instead of using the hint in
production.
Locks, Blocks, and Deadlocks
One of the
hallmarks of a true database management system (DBMS) is whether it has
the capability to handle more than one user performing simultaneous data
modifications. The problem is that when several users in a database
make changes, it is likely that they will eventually want to update the
same record(s) at the same time. To avoid the problems this would cause,
SQL Server and most other DBMSs provide a locking mechanism.
A locking mechanism
provides a way to check out a particular row or set of rows from the
database so that they cannot be changed by another user until the
connection is finished and the changes are made. For connections that
are reading data, locking provides a mechanism to prevent other
connections from changing the data for the duration of the read or
longer. There are two basic types of locks:
Shared locks—
A shared lock occurs when a user is trying to read a row of data; for
some duration, depending on the transaction isolation level , the user owns a shared lock on the table. Because the user is
just trying to read the record, there can be several shared locks on
the row, so many people can read the same record at the same time.
Exclusive locks—
A user obtains an exclusive lock when he or she needs to change the
row. Exclusive locks are not shared; there can be only one user with an
exclusive lock on a row at any given time. If a user needs to acquire an
exclusive lock to a row that is already locked by another user, lock
contention occurs. Some level of contention is normal in a database that
is frequently updated. Typically, an application waits for some
arbitrary amount of time for the locks to clear and the transaction to
complete. This results in an apparent slowdown of the application and
the server, and excessive amounts of contention lead to performance
degradation and possibly user complaints.
Reducing Lock Contention
There
are a few things you can do to reduce lock contention. First, you can
make transactions as simple as possible and keep extraneous logic out of
transactions. The best case is when you do all the gathering of data
and validation of that data outside the transaction, and the transaction
is used only to update and insert rows. Second, you should make sure
that the application does not have any transactions that wait for user
input because users tend to do such things as go to lunch while they
have windows open, waiting for them to enter data to complete their
transactions.
In general, you should try
to collect all the data at once, and then you should start the
transaction, make the changes, and commit the changes. You should design
applications and databases with concurrency in mind. You should keep
tables that are frequently updated small by moving columns that don’t
belong in the table or that aren’t changed very often into another
table. If a table is going to be updated frequently, you should make
sure that it isn’t indexed more than necessary. Data modification
statements, such as INSERT, UPDATE, and DELETE, have to change the indexes as they go, so having too many indexes on a table requires them to modify several indexes.
With contention also comes a circumstance known as a deadlock.
A deadlock occurs when two processes are locking resources and each one
wants the resource that the other has locked. Deadlocks occur when two
or more transactions cannot complete because of mutual locks. SQL Server
detects a deadlock and rather randomly kills one of the user processes.
Levels of Locks
In addition to using
shared and exclusive locks, SQL Server also locks objects at different
levels. SQL Server can lock a single row of a table, a single data page,
or an entire table.
Typically, SQL Server
operates in the page lock mode, in which it locks the data pages being
requested. After a certain amount of blocking is noticed, SQL Server
slips into a row locking mode, in which single rows are locked.
On
the other end of the scale, when a connection attempts to update a
certain percentage of a table, SQL Server automatically escalates to a
table lock, in which it automatically locks the entire table either
exclusively (in the case of a full table update) or in a shared mode (in
the case of a full table read). SQL Server also determines lock
escalation based on the activity occurring in the table at the time of
the lock request. If the activity level is low, SQL Server saves itself
some time by escalating the lock sooner so that it has less effect on
other users.
There are shared page
locks, shared row locks, and shared table locks for reads, along with
exclusive page locks, exclusive row locks, and exclusive table locks for
writes. Locks and the control of locking behavior are important in
DBMSs. To aid in management over a DBMS, you can also implement
transactions that directly affect locking behavior. The following
section explains how to do this.
Diagnosing Lock Problems
You can diagnose
lock, block, and deadlock problems by using the System Monitor (How
many are occurring?), Activity Monitor (What is currently occurring?),
and SQL Server Profiler (What series of events causes the problem?). A
number of stored procedures, dynamic management views or functions, and
system tables can also provide information about locking.
Locking-Related Stored Procedures
The sp_lock
stored procedure reports snapshot information about locks, including the
object ID, index ID, type of lock, and type or resource to which the
lock applies. The sp_lock procedure is
also a Transact-SQL (T-SQL) alternative that you can use to view lock
activity in the Activity Monitor in SQL Server Management Studio. The sp_lock
procedure is provided only for backward compatibility and is considered
a deprecated function. You may want to query locking information from
dynamic management views as an alternative.
Locking-Related Dynamic Management Views
You can query the sys.dm_tran_locks
dynamic management view to obtain information about the current state
of locking. This is a preferred source of information in SQL Server
2005, although, as with all other dynamic views, it is supported only
within the current version of SQL Server.
You can use the sys.dm_exec_sessions view to retrieve information about processes currently running on the server, and you can use sys.dm_exec_requests to get further information about procedures blocking an execution request.
SQL Server Profiler Lock Events
You can use SQL Server Profiler to trap several lock events:
Deadlock Graph Event Class— Provides an XML description of a deadlock.
Lock:Acquired Event Class— Indicates whether a lock has been acquired on a resource.
Lock:Cancel Event Class— Tracks requests for locks that were canceled before the locks were acquired.
Lock:Deadlock Chain Event Class— Is used to monitor when deadlock conditions occur and which objects are involved.
Lock:Deadlock Event Class— Tracks when a transaction has requested a lock on a resource already locked by another transaction, resulting in a deadlock.
Lock:Escalation Event Class— Indicates that a finer-grained lock has been converted to a coarser-grained lock.
Lock:Released Event Class— Tracks when a lock is released.
Lock:Timeout (timeout > 0) Event Class— Tracks when lock requests cannot be completed because another transaction has a blocking lock on the requested resource.
Lock:Timeout Event Class— Tracks when lock requests cannot be completed because another transaction has a blocking lock on the requested resource.
It is a good idea to
trap most of these events if you are using SQL Server Profiler as a
means of diagnosing locking and deadlocks in applications.
System Monitor Lock Counters
The SQLServer:Locks
object provides information on locking. Minimizing locks increases
concurrency, which can improve performance. You can monitor multiple
instances of the Locks object at the same time. The following are the available SQL Server Locks counters:
Average Wait Time (ms)— Monitors the average time (in milliseconds) for each lock request.
Lock Requests/sec— Monitors the number of new locks and lock conversions per second.
Lock Timeouts (timeout > 0)/sec —Monitors the number of lock requests per second that timed out, excluding NOWAIT locks.
Lock Timeouts/sec— Monitors the number of lock requests per second that timed out, including NOWAIT locks.
Lock Wait Time (ms)— Monitors the total wait time (in milliseconds) for locks.
Lock Waits/sec— Monitors the number of lock requests per second that required the caller to wait.
Number of Deadlocks/sec— Monitors the number of lock requests per second that resulted in deadlock.
Obviously, you want the
number of deadlocks per second to be non-existent, but this is the most
common counter utilized when monitoring locking behavior.